ABSTRACT

This is a psychometric hoax paper, the purpose of which is to indicate
once again the importance of cross-validation, particularly in the
development of specially-keyed inventories. The junior author and the
new psychometric method play critical roles in the study. Appropriate
credit and references are present. (Author)

It is difficult to know where to begin with a study as truly special
and exciting as this. As suggested by the title, the purpose of the study
was to improve the validity of a tailor-made scoring key by the application
of a new psychometric method to an old tried and untrue experimental design.
Let me then first summarize (1) the background and (2) the procedure and
results of the study, before getting into the details of the procedure,
particularly the new psychometric method.

Background

There has developed over the past several years a body of literature in
applied psychometrics that would indicate that the empirical development of
tailor-made scoring keys is to be preferred to "store-bought" and/or a priori
scoring keys. Further, the apparent plateau, or perhaps ceiling, for validity
coefficients also seems to suggest the pressing need for breakthroughs in new
and innovative psychometric methods and instruments.

However, as Kurtz pointed out so well in 1948, too often the wishes and
hopes of the practitioner/developer and/or the consumer manifest themselves in
a strange form of selective perception and self-deception in the evaluation of
the effectiveness of such tailor-made keys, i.e., the acceptance of self-
fulfilling "research" via the foldback design (that old tried and untrue
design in which the tailor-made key is "tested" by re-applying it to the
same data base from which it was originally developed).

Procedure and Results

Regarding the current study, in the data collection phase, 100 special
subjects responded to a special instrument (100 2-alternative items) through
a special response mode. A new psychometric method was employed extensively
in the data collection. Then, an equally special external criterion was
developed with which the tailor-made key was subsequently developed.

Following data collection, there was accomplished an item analysis utilizing
the special external criterion. The item analysis identified 24 of the 100
items for the special tailor-made key.

Application of this key in the same data base resulted in a biserial
correlation of .99+. At this point the authors were extremely encouraged, as
one might well imagine, both in terms of the new psychometric method and in
terms of the key and the instrument.

However, it was decided to conduct the "academic nicety" of cross-validation.
Application of the key in cross-validation resulted in a disappointing
biserial correlation of .19.

The first coefficient reported is (clearly) significant beyond the .01
level; the second coefficient reported is not significant at the .05 level.
Why the discrepancy? How to account for the shrinkage? Or better, the inflation?

With this overview, perhaps we can retrace the research methodology so as
to explain and better understand this discrepancy.
[...]metric method to an old tried and untrue experimental design to improve
the [...] the special, self-[...] fallacious, but fascinating results that
are obtained [...] cross-validation does not follow item analysis.



Procedure 
With this last revelation in mind, let us examine in more detail the
procedure generally, and the new psychometric method specifically. And, as
the traditional "senior author," let me give credit where credit is due
regarding the relative contributions of the senior author, the second author,
and, perhaps particularly, the junior author.

Data Collection

The subjects, the instrument (the new psychometric method), and the
external criterion follow. As previously indicated, the subjects, the
instrument, and the criterion were all quite special.

Subjects. Indeed, the subjects were special; in fact, they did not exist
except in the rich (however bizarre) imaginations of the authors. They are
surely hypothetical, science fictional if you will. If credit must be taken,
the senior author assumes the credit for the special subjects.

New Psychometric Method. The special instrument I now hold in my hand.
As you see, it is a United States penny, circa 1971. (Not being much of a
grantsman, the project was run on an extremely modest and limited budget.)
You will note that the coin has two sides, i.e., two alternatives. A flip of
the instrument by the junior author established the convention as to whether
heads would be alternative A or B. (As it turned out, heads was B.) The
second author then laid the "instrument" upon his thumb and proceeded to
flip the coin 100 times for each of the 100 hypothetical subjects. If the
coin came up heads, a B response was recorded; if the coin came up tails,
an A response was recorded. The University generously provided the 100
answer sheets (the 100 subjects).

This coin flipping then was the "new psychometric method." As I will
give an appropriate name to the psychometric method later in the paper,
I will at this juncture (demonstrating unusual self-discipline) resist the
temptation of describing the method as "a series of one-tailed tests," or
"the use of a digital computer," or even "one of cumulative side effects."

In this manner, that is, with this special instrument and its attendant
new psychometric method, 100 2-alternative responses were generated for each
of the 100 special subjects. Let us now turn to the special external criterion
used in the study.

Criterion. Following the development of the 100 100-response answer
sheets, a step-wise algorithm was used to develop the special external
criterion. Specifically, the senior author (in the pedagogical spirit and
tradition often suggested by students) stood at the top of an (outside)
staircase and allowed the 100 answer sheets to tumble and float to the base
of the staircase, littering the various individual stairs in the process.
At this point, the junior author (eager to please, as junior authors are
wont) recouped the answer sheets in whatever order (i.e., random) they had
happened to fall. (I suppose one could refer to this as a "least-stairs
solution.")

At this point, the second author stratified the 100 answer sheets into
2 stacks of 50 each, i.e., he sorted, odd-even. Then the junior author flipped
another penny (the other half of the budget) to establish which stack of 50
would be the high criterion group and which stack of 50 would be the low
criterion group. Following this re-application of the new psychometric method,
in a blaze of scientific rigor, the junior author once again flipped the coin
to establish which half of the high group and which half of the low group
would be the item analysis (primary) group and which half of each criterion
group would be the cross-validation (holdout) group (yet another application
of the new psychometric method!). In this manner, 4 sets of 25 answer sheets,
i.e., high primary, low primary, high holdout, and low holdout, were
established for the study. (We had originally planned to use several random
tables in the criterion development phase. If we had, I suppose I could have
now referred to these tools as "a number of random tables.")

Summary of data collection. Through the application of a "new psychometric
method" (coin flipping), a data base of 100 2-alternative responses was
generated for 100 (hypothetical) subjects. Utilizing a similarly generated
special external criterion, these 100 answer sheets were further sub-divided
into 4 analysis groups as per any traditional item analysis project.
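
For readers without a penny or a staircase to hand, the data collection just
summarized can be imitated in a few lines of Python. This is only an
illustrative sketch and not the authors' procedure: the function name
collect_data, the seed, and the use of a pseudo-random generator in place of
the coin and the staircase are all assumptions.

```python
import random


def collect_data(n_subjects=100, n_items=100, seed=1971):
    """Imitate the coin-flip data collection (hypothetical helper)."""
    rng = random.Random(seed)  # assumption: a seeded generator stands in for the penny

    # One answer sheet per subject: each item is a coin flip, "A" or "B".
    sheets = [[rng.choice("AB") for _ in range(n_items)]
              for _ in range(n_subjects)]

    # The staircase: answer sheets are recouped in random order.
    order = list(range(n_subjects))
    rng.shuffle(order)

    # Odd-even sort into two stacks of equal size.
    stack_one, stack_two = order[0::2], order[1::2]

    # A coin flip decides which stack is the high criterion group.
    if rng.random() < 0.5:
        high, low = stack_one, stack_two
    else:
        high, low = stack_two, stack_one

    def split(stack):
        # A further coin flip decides which half is primary, which holdout.
        half = len(stack) // 2
        first, second = stack[:half], stack[half:]
        return (first, second) if rng.random() < 0.5 else (second, first)

    high_primary, high_holdout = split(high)
    low_primary, low_holdout = split(low)
    groups = {
        "high_primary": high_primary,
        "low_primary": low_primary,
        "high_holdout": high_holdout,
        "low_holdout": low_holdout,
    }
    return sheets, groups
```

Calling collect_data() returns the 100 answer sheets together with four lists
of 25 sheet indices, one list per analysis group.
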
Data Analysis

There were three phases to the data analysis of this research, i.e.,
(1) item analysis, (2) foldback, and (3) cross-validation.

Item Analysis. The 100 items in the item pool were item analyzed using
the procedure described by Lawshe and Baker (1950) with the special external
criterion as previously described. In the item analysis, there were 25 in the
high group and 25 in the low group. Alpha of .05 was used to identify the
"discriminating" items for inclusion in the "special" key.



Foldback. The "items" surviving the item analysis were (re)applied
to the answer sheets of the item analysis group. The predictive validity of
the key was documented by biserial correlation.

Cross-Validation. However, for those more interested in the better
(rather than the more fulfilling) estimate of the relationship between the
derived key and the external criterion, the items surviving the item analysis
were scored in the holdout groups of 25 high answer sheets and 25 low answer
sheets. Again, biserial correlation was obtained to quantify the relationship
between the special key and the criterion, i.e., the predictive validity.
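
The three phases above can be imitated on coin-flip data. The sketch below is
a stand-in under stated assumptions, not a reproduction of the study: a
two-proportion z test (|z| >= 1.96) replaces the Lawshe and Baker (1950)
procedure, a point-biserial correlation replaces the biserial, and every
function name is hypothetical. Because every answer sheet is random, the four
groups are simply generated separately rather than dealt down a staircase.

```python
import math
import random


def make_group(rng, n_sheets=25, n_items=100):
    # Coin-flip answer sheets: 1 stands for response B, 0 for response A.
    return [[rng.randint(0, 1) for _ in range(n_items)]
            for _ in range(n_sheets)]


def item_analysis(high, low, z_crit=1.96):
    """Return {item index: keyed response} for items separating high from low.

    Assumption: a two-proportion z test, not the Lawshe-Baker procedure.
    """
    key = {}
    n_h, n_l = len(high), len(low)
    for j in range(len(high[0])):
        c_h = sum(sheet[j] for sheet in high)
        c_l = sum(sheet[j] for sheet in low)
        pooled = (c_h + c_l) / (n_h + n_l)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n_h + 1 / n_l))
        if se == 0:
            continue  # everyone gave the same response; nothing to test
        z = (c_h / n_h - c_l / n_l) / se
        if abs(z) >= z_crit:
            # Key the response that the high group gave more often.
            key[j] = 1 if z > 0 else 0
    return key


def validity(key, high, low):
    """Point-biserial correlation between key score and group membership."""
    scores = [sum(1 for j, keyed in key.items() if sheet[j] == keyed)
              for sheet in high + low]
    crit = [1] * len(high) + [0] * len(low)
    n = len(scores)
    mean_s, mean_c = sum(scores) / n, sum(crit) / n
    cov = sum((s - mean_s) * (c - mean_c) for s, c in zip(scores, crit))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_c = sum((c - mean_c) ** 2 for c in crit)
    if var_s == 0 or var_c == 0:
        return 0.0  # e.g. no items survived, so every score is zero
    return cov / math.sqrt(var_s * var_c)


def run_study(seed):
    rng = random.Random(seed)
    high_primary, low_primary, high_holdout, low_holdout = (
        make_group(rng) for _ in range(4))
    key = item_analysis(high_primary, low_primary)        # phase 1
    foldback = validity(key, high_primary, low_primary)   # phase 2
    holdout = validity(key, high_holdout, low_holdout)    # phase 3
    return len(key), foldback, holdout


if __name__ == "__main__":
    n_kept, foldback, holdout = run_study(seed=1971)
    print(f"items kept: {n_kept}, foldback r: {foldback:.2f}, "
          f"holdout r: {holdout:.2f}")
```

In this sketch the foldback coefficient is positive whenever any item survives
the item analysis, since every surviving item was keyed in the direction that
favored the high primary group; the holdout coefficient receives no such help.
No particular count of surviving items or size of coefficient is promised.
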

Results

Item analysis procedure identified 24 items (chance would have been 5)
which discriminated between the high and low groups at or beyond the .05
level. No doubt you will be interested in which items "came through." They
were items 7, 10, 12, 16, 20, 21, 27, 31, 34, 35, 41, 42, 43, 59, 64, 66, 68,
72, 77, 81, 84, 88, 91, and 92. These item numbers are as meaningful in
context as they are out of context (or vice versa?).

Applying these 24 items back upon the original sample in which they were
derived, the obtained biserial correlation was .99+. No doubt, rounding error
prevented the completely self-fulfilling prophecy. This was most encouraging,
as this obtained coefficient is clearly off zero beyond the .05 level.
(Consider here for a moment those of your acquaintance and/or your employ
using this foldback design and at this point mouthing such quasi-professional,
but sage, things as "Of course, these results should be interpreted with some
caution.")
Unfortunately, when the key was applied to the holdout sample
of 50, the encouraging coefficient of .99+ shrank slightly. In fact, it
shrank back to .19 (not significantly off zero at the .05 level). Too bad;
we felt we were on to something, both in terms of a new psychometric method
and in terms of the operational utility of the key and instrument.

For those of you who are psychometric purists, you will be excited to
learn that the obtained odd-even, corrected reliability of the key was .29
(N = 100).
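
If the "correction" referred to is the usual Spearman-Brown step-up of the
odd-even half-test correlation (an assumption; the text does not name the
correction), the arithmetic can be sketched as follows. The function names
are illustrative only.

```python
def spearman_brown(r_half):
    """Step a half-test correlation up to full-length reliability."""
    return 2 * r_half / (1 + r_half)


def half_from_full(r_full):
    """Invert the step-up: recover the half-test correlation."""
    return r_full / (2 - r_full)


if __name__ == "__main__":
    r_half = half_from_full(0.29)  # assumes .29 is the stepped-up value
    print(f"implied odd-even correlation: {r_half:.2f}")
```

Under that assumption, a corrected reliability of .29 corresponds to an
odd-even correlation of about .17.
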

Discussion and Conclusion

At this point, the reason for the obtained discrepancy between the
foldback results and the cross-validation (hopefully) should be perfectly
clear. The whole thing was a hoax; the old tried and untrue design, i.e.,
foldback, really did (and does) make something out of nothing, in this case
out of something slightly less than nothing. Little or no further discussion
seems necessary. In a sense (no pun intended) Cureton's classic paper (1952)
has been re-executed. At the suggestion of the junior author (still eager to
help), I call your attention to the recent treatment of this subject by the
senior author (Blumenfeld, 1972). It seems (perhaps cruelly) clear once again
that (1) the application of the key to the control group is the acid test of
the quality of the key and (2) the (re)application of the key to the original
group is but a half-acid test of the quality of the key.

One would think that this point has been well made often enough, but as
an applied psychologist dealing with students and practitioners of business
administration and/or educational administration, it is painfully clear to me
that the foldback design still remains very much in vogue. (For a recent
insidious execution of the foldback experimental design, see, for example,
Sovak, 1970.) It is for that reason that I continue to believe that it is
appropriate to beat home the point of cross-validation, i.e., let's have no
more of this half-acid research!

Oh yes, there seem to be two pieces of business yet to be handled.
These concern the junior author and the naming of the new psychometric method.

Regarding the junior author, he is now 5½ years old. At the time of the
study he was 3½ years old. (The publication lag takes its toll on all of us.)

Regarding the naming of the new psychometric method, you will recall
that the explicit operational mechanics of the procedure were to lay the coin
upon one's thumb and flip. Considering the non-consistency between the
flippings (i.e., the application of the new psychometric method) of the second
and junior authors, and, if you will not think it too flippant of me, I
consider it uncommonly and punishingly appropriate to call the new psychometric
method:

"THE METHOD OF NON-CONSTANT THUMBS"

And, at the risk of losing our place in psychometric history, the future
application of this method is not recommended.